📊 ProcureCast 360: Predictive Spend Forecasting & Supplier Risk Analytics
🔍 Overview
This project simulates an end-to-end procurement analytics pipeline inspired by SAP ERP systems.
It ingests structured procurement data, forecasts spend, and assesses supplier risk using modern data engineering
and machine learning workflows.
The solution demonstrates how enterprises can leverage advanced analytics to optimize procurement decisions,
improve supplier relationships, and drive cost savings.
📂 Data Sources and Context
The pipeline processes synthetic SAP-style procurement tables, closely resembling real-world data such as:
- Purchase Orders (PO): Line items, quantities, net prices, vendors
- Invoices: Invoice amounts, dates, and payment terms
- Vendor Master Data: Supplier profiles, risk scores, and historical performance
This structured data mimics SAP MM module exports (e.g., EKKO for purchase order headers, EKPO for purchase order line items, LFA1 for vendor master data), enabling realistic procurement analytics workflows.
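As a rough illustration of how these tables relate, the sketch below joins a few hand-written rows with pandas. The column names are the standard SAP field names (EBELN, EBELP, LIFNR, BEDAT, MATKL, MENGE, NETPR, NAME1, LAND1); the rows, the `build_po_lines` helper, and the `line_value` column are hypothetical and not taken from the project's code. It also assumes a price unit of 1, so quantity times net price gives the line value.

```python
import pandas as pd

# Hypothetical synthetic rows; only the column names follow SAP conventions.
ekko = pd.DataFrame({  # purchase order headers
    "EBELN": ["4500000001", "4500000002"],          # PO number
    "LIFNR": ["V100", "V200"],                      # vendor number
    "BEDAT": pd.to_datetime(["2024-01-15", "2024-02-03"]),  # PO date
})
ekpo = pd.DataFrame({  # purchase order line items
    "EBELN": ["4500000001", "4500000001", "4500000002"],
    "EBELP": [10, 20, 10],                          # item number
    "MATKL": ["RAW", "PACK", "RAW"],                # material group
    "MENGE": [100.0, 50.0, 200.0],                  # quantity
    "NETPR": [2.5, 1.2, 2.4],                       # net price (price unit assumed 1)
})
lfa1 = pd.DataFrame({  # vendor master
    "LIFNR": ["V100", "V200"],
    "NAME1": ["Acme Metals", "Borealis Packaging"],
    "LAND1": ["DE", "SE"],                          # country key
})


def build_po_lines(ekko, ekpo, lfa1):
    """Attach header and vendor attributes to each PO line item."""
    lines = ekpo.merge(ekko, on="EBELN", how="left", validate="many_to_one")
    lines = lines.merge(lfa1, on="LIFNR", how="left", validate="many_to_one")
    lines["line_value"] = lines["MENGE"] * lines["NETPR"]
    return lines


po_lines = build_po_lines(ekko, ekpo, lfa1)
print(po_lines[["EBELN", "EBELP", "NAME1", "BEDAT", "line_value"]])
```

The `validate="many_to_one"` argument makes pandas raise an error if a PO number or vendor number is duplicated on the header or vendor side, which is a cheap integrity check for synthetic data.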
🧭 Approach
The pipeline follows a modular design:
- Data ingestion and preprocessing using Python and pandas
- Spend forecasting with Prophet (formerly Facebook Prophet) time series models
- Supplier risk scoring using Isolation Forest anomaly detection
- ETL orchestration with Apache Airflow
- Storage in Parquet files and a PostgreSQL relational database
- Visualization in Power BI dashboards
Each component is containerized via Docker for reproducibility.
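The preprocessing and forecasting stages listed above could look roughly like the sketch below: one small function per stage. The function names, the `po_date` and `line_value` columns, and the sample rows are assumptions made for illustration, not the project's actual code. Prophet expects a frame with a `ds` date column and a `y` value column, so the aggregation step produces exactly that. The Prophet import sits inside `forecast_spend` so the aggregation step runs even where Prophet is not installed.

```python
import pandas as pd


def monthly_spend(po_lines: pd.DataFrame) -> pd.DataFrame:
    """Sum line values per calendar month and return Prophet-style ds/y columns."""
    spend = (
        po_lines.set_index("po_date")["line_value"]
        .resample("MS")   # month-start bins; months with no orders sum to 0
        .sum()
        .reset_index()
    )
    return spend.rename(columns={"po_date": "ds", "line_value": "y"})


def forecast_spend(history: pd.DataFrame, periods: int = 6) -> pd.DataFrame:
    """Fit a default Prophet model and forecast `periods` months ahead."""
    from prophet import Prophet  # lazy import: only needed for this stage

    model = Prophet()
    model.fit(history)
    future = model.make_future_dataframe(periods=periods, freq="MS")
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Hypothetical PO lines, already reduced to a date and a value per line.
po_lines = pd.DataFrame({
    "po_date": pd.to_datetime(["2024-01-15", "2024-01-28", "2024-03-03"]),
    "line_value": [250.0, 60.0, 480.0],
})
history = monthly_spend(po_lines)
print(history)
```

With a realistic history (for example, a couple of years of monthly totals), `forecast_spend(history)` would return the point forecast and its uncertainty interval. Forecasting by supplier or category would apply the same two functions to each group separately.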
⚙️ Methodologies
- Time Series Forecasting: Prophet models to predict future spend by supplier and category
- Anomaly Detection: Isolation Forest to flag potential high-risk suppliers based on transaction patterns
- ETL Pipelines: Airflow DAGs automate ingestion, transformation, and modeling tasks
- Data Aggregation: Grouped metrics by time period, supplier, and material type
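To make the anomaly detection step concrete, here is a minimal sketch of Isolation Forest scoring on per-supplier features with scikit-learn. The feature names (`avg_order_value`, `orders_per_month`, `invoice_po_gap_pct`), the generated data, the `score_suppliers` helper, and the `contamination` value are all assumptions chosen for illustration; the project's real features and thresholds may differ.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical per-supplier features, as they might come out of the aggregation step.
rng = np.random.default_rng(0)
n = 30
typical = pd.DataFrame({
    "supplier": [f"V{i:03d}" for i in range(n)],
    "avg_order_value": rng.normal(1000, 50, n),
    "orders_per_month": rng.normal(20, 2, n),
    "invoice_po_gap_pct": rng.normal(1.0, 0.3, n),
})
# One deliberately unusual supplier: large orders, few of them, big invoice gaps.
unusual = pd.DataFrame({
    "supplier": ["V999"],
    "avg_order_value": [9000.0],
    "orders_per_month": [2.0],
    "invoice_po_gap_pct": [35.0],
})
features = pd.concat([typical, unusual], ignore_index=True)


def score_suppliers(features: pd.DataFrame, contamination: float = 0.05) -> pd.DataFrame:
    """Return the features with a risk score and an anomaly flag, riskiest first."""
    X = features.drop(columns="supplier")
    model = IsolationForest(
        n_estimators=200, contamination=contamination, random_state=0
    )
    model.fit(X)
    scored = features.copy()
    # score_samples is lower for more anomalous points; negate so higher = riskier.
    scored["risk_score"] = -model.score_samples(X)
    scored["flagged"] = model.predict(X) == -1  # -1 marks predicted anomalies
    return scored.sort_values("risk_score", ascending=False)


scored = score_suppliers(features)
print(scored.head())
```

Isolation Forest is unsupervised, so a flag means "unusual relative to the other suppliers", not "confirmed risky". The `contamination` parameter sets the share of suppliers that get flagged and would need tuning against real data.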
🧰 Technologies
- Languages: Python, SQL
- Libraries: pandas, scikit-learn, Prophet
- Platforms: Apache Airflow, Docker, Power BI
- Storage: Parquet, PostgreSQL
📈 Results and Impact
The pipeline processed large simulated procurement datasets end to end, generating spend forecasts by supplier and category
and flagging suppliers with anomalous transaction patterns. The Power BI dashboards give stakeholders a view
of spending trends and supplier performance.
This approach demonstrates how integrating forecasting and anomaly detection can improve procurement strategy and reduce supply chain risks.
🔮 Future Enhancements
- Integrate real-time streaming ingestion with Apache Kafka
- Incorporate additional risk scoring models using ensemble methods
- Develop automated email alerts for anomalous spend patterns
- Extend dashboards with drill-through supplier profiles